Abstract

The constitutive expression of the high-risk HPV E6 and E7 viral oncogenes is the major cause of cervical cancer. To
comprehensively explore the composition of HPV16 early transcripts and their genomic annotation, cervical squamous
epithelial tissues from 40 HPV16-infected patients were collected for amplification of papillomavirus oncogene transcripts
(APOT). We observed different transcription patterns of HPV16 oncogenes during progression of cervical lesions to cervical
cancer and identified one novel transcript. Multiple-integration events were significantly more frequent in cervical
carcinoma (CxCa) tissues than in low-grade squamous intraepithelial lesions (LSIL) and high-grade squamous intraepithelial
lesions (HSIL). Moreover, most cellular genes within or near these integration sites are cancer-associated genes. Taken
together, this study suggests that multiple integrations of the HPV genome during persistent viral infection, which alter
the expression patterns of viral oncogenes and integration-related cellular genes, play a crucial role in the progression
of cervical lesions to cervical cancer.

Introduction

Cervical cancer is the second leading cause of cancer-related
death in women worldwide. Persistent infection by high-risk
human papillomavirus (HR-HPV), such as genotypes 16, 18, 31,
33, 35, 39, 45, 51, 52, 56, 58, and 59, is essential for the
progression of cervical lesions [1], and over 50% of cases are
caused by HPV16 [2]. The viral oncoproteins E6 and E7 of HR-HPVs
contribute to cervical carcinogenesis by inactivating two major
cellular tumor suppressor proteins, p53 and pRb, respectively
[3-6]. These viral oncoproteins can also cause chromosome
instability and the accumulation of mutations in infected cells [7].

A viral early promoter located upstream of the E6 ORF, such as
P97 in HPV16 [8,9], P99 in HPV31 [10,11] and P105 in HPV18
[12,13], is responsible for almost all early gene expression,
including E6 and E7. Upstream cis-elements in the long control
region (LCR) interact with cellular transcription factors and the
viral transactivator/repressor E2 to regulate the transcription of
the HPV E6 and E7 genes [8,14]. Furthermore, DNA methylation [15],
alternative RNA splicing [9,16,17] and the early polyadenylation
signal [18,19] also take part in the regulation of E6 and E7 gene
expression [19].



To date, full transcription maps of oncogenic HPV16 and
HPV18 in HPV-infected cells and raft tissues have been
constructed [19,20].

It is well known that integration of the HPV genome is a key
event in cervical carcinogenesis [21,22]. Besides activating
cellular oncogenes or inactivating cellular tumor suppressor
genes [23-25], integration of the HPV genome into the host
genome may change the transcription patterns of both viral
and host genes [26]. It has been reported that integration can
disrupt the viral E2 gene and thereby release its inhibition of
the viral early promoter that controls the expression of E6 and
E7 [27]. In addition, E6 and E7 transcripts cotranscribed with
cellular sequences may be more stable, which enhances their
expression level [28-30].

Transcription patterns of HPV16 in cervical cancer tissues
have been reported [26,31]. An episomal HPV early gene
transcript (E7-E1^E4) and several integrated HPV transcripts
(such as E7-E1-cellular RNA and E7-E1^E4-cellular RNA) were
found in HPV16-infected tissues. However, transcriptional
selection in response to environmental changes is a dynamic
process to achieve optimal gene expression for cell survival and
carcinogenesis [32]. In this study, we applied a modified version
of the amplification of papillomavirus oncogene transcripts
(APOT) technique [26] to comprehensively explore the structure
and sequences of HPV16 E7-related transcripts and their genomic
annotation in HPV16-positive cervical biopsy samples: 8 LSIL
(low-grade squamous intraepithelial lesions), 24 HSIL (high-grade
squamous intraepithelial lesions), and 8 CxCa.

Figure 1. Specificity of the APOT assay in detection of HPV16 oncogene transcripts. Products amplified by the APOT assay from
CaSki cells (A), HPV16-positive CxCa (B), HPV-negative normal cervical tissues (C) and the "minus-RT" controls of the RNA
isolated from HPV16-positive samples (D) were separated on 2.5% agarose gels. A: Lanes 1-3, three different subcultures of
CaSki cells (positive controls); B: Lanes 1-3, three different CxCa samples; C: Lanes 1-3, three different normal cervical
tissues (negative controls); D: Lanes 1-3, "minus-RT" controls corresponding to the samples in B; M: 250-bp DNA ladder.
doi:10.1371/journal.pone.0097588.g001

Materials and Methods 

Patients and specimens 

Tissue samples of primary uterine cervical lesions containing
dysplastic epithelium or tumor cells were collected at the Second
Affiliated Hospital of Wenzhou Medical University (Zhejiang
Province, China) from December 2010 to April 2012. The
presence of HR-HPV was detected by the HCII test, and HPV16
was identified in HR-HPV-positive samples with an HPV genotype
detection kit (KaiPu, Guangzhou, China) [33]. None of the
patients had received radiation therapy or chemotherapy before
the operation, and each patient underwent a colposcopically
directed biopsy. The collected biopsy specimens were bisected.
One portion was submitted for standard histopathologic diagnosis,
while the other portion was stored in RNAlater (Ambion, Austin,
Texas, USA) at -80°C for subsequent analysis. On the basis of the
histopathologic diagnosis, the samples were divided into LSIL
(CIN I, n = 8), HSIL (CIN II, n = 22; CIN III, n = 16) and cervical
carcinoma (CxCa, n = 17). An additional 8 cervical tissues with
normal cytology and no detectable HPV DNA were obtained as
controls from patients who underwent hysterectomy for benign
gynecologic diseases. The study was approved by the Medical
Ethics Committee of the Second Affiliated Hospital of Wenzhou
Medical University. All women were informed and gave their
written consent to participate in the study.



RNA and DNA Isolation from Clinical Samples

Total RNA from the biopsy samples described above was isolated
using TRIzol reagent (Invitrogen, Calif, USA) according to the
manufacturer's instructions. To remove residual DNA
contamination, the RNA preparation was treated with RNase-free
DNase I (Takara, Dalian, China) according to the manufacturer's
protocol. Purified total RNA was dissolved in RNase-free water
and stored at -80°C. The concentration and purity of total RNA
were assessed by ultraviolet spectrophotometry at 260 nm and
280 nm and by 1% agarose gel electrophoresis. Only RNA
samples with an A260/A280 ratio of 1.8-2.0 and high integrity
were used for further experiments.

Reverse Transcription and PCR Amplification of Transcripts

The previously reported APOT assay was based on nested PCR
reactions [26], which amplify only the abundant transcripts and
miss transcripts present at lower levels in the samples. A
modified APOT assay was therefore used to amplify the HPV
oncogene transcripts. The primers for these reactions were
designed according to Klaes R, et al. [26]. Total RNA (1 µg) was
reverse transcribed using an oligo(dT)17 primer coupled to a
linker sequence (RT) [34], following the manufacturer's protocol
of the reverse transcriptase kit (TOYOBO, Japan). To verify
first-strand cDNA quality, PCR using glyceraldehyde-3-phosphate
dehydrogenase (GAPDH)-specific primers was performed as
previously described [35]. First-strand cDNA encompassing viral
oncogene sequences was subsequently amplified by PCR using the
HPV16 E7-specific primer p1 (5'-CGGACAGAGCCCATTACAAT-3')
as the forward primer and the linker primer pO
(5'-GACTCGAGTCGACATCG-3') as the reverse primer; the PCR
amplification was carried out in a reaction volume of 50 µl. In
contrast to previous reports, the number of PCR cycles was
increased to 35, and all specimens were subjected to only one
round of PCR. To verify the specificity of this procedure, a
"minus-RT" control, in which reverse transcriptase was omitted
from the reaction, was performed in parallel.

Figure 2. Three types of HPV16 early gene transcription
pattern. Type A shows E1 sequences spliced directly to cellular
flanking sequence; Type B shows E1 spliced to E2, with E2 fused to a
cellular sequence; Type C shows E1 spliced to E4, with E4 running into a
cellular sequence. In Type A, there are two integration sites in E1 (data
shown in Figure S3). The boxes within slashes represent the six
nucleotides between E7 and the E1 gene.
doi:10.1371/journal.pone.0097588.g002

Figure 3. Detection of HPV16 early gene transcripts by the APOT assay. HPV16-positive clinical samples with LSIL, HSIL and
CxCa were subjected to the APOT assay and the products were separated on 2.5% agarose gels. Lanes 1-5 are five different
samples of each pathological type. M: 250-bp DNA ladder.
doi:10.1371/journal.pone.0097588.g003



Sequence Analysis of Transcripts 

The APOT amplification products were visualized by 2.5%
agarose gel electrophoresis. PCR products of interest were excised
from the gel and extracted using a DNA agarose gel recovery kit
(TianGen, Beijing, China). The corresponding amplimers were
cloned into a cloning vector (TransGen, Beijing, China) and
sequenced on an ABI 3730 XL Genetic Analyser (Applied
Biosystems, USA) according to standard protocols. Sequencing
results were analyzed using the BLASTn program provided by the
National Center for Biotechnology Information (NCBI), USA. The
chromosomal integration sites were ascertained using the BLAST
services of NCBI and of the European Molecular Biology
Laboratory's European Bioinformatics Institute (EMBL-EBI). The
fragile sites and genes at the integration sites were defined
using the NCBI fragile site map viewer and the UCSC BLAT tool.
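
As a rough illustration of how BLASTn results for one cloned amplimer could be turned into a junction call, the sketch below parses tabular BLAST output (the standard 12-column `-outfmt 6` layout) and reports where the viral match ends and the cellular match begins on the clone sequence. It is not the procedure used in this study: the subject names, the example hit lines, their coordinates and the helper `find_junction` are hypothetical and serve only to show the idea.

```python
from typing import List, NamedTuple, Optional, Tuple


class Hit(NamedTuple):
    query: str    # name of the cloned amplimer
    subject: str  # name of the matched reference sequence
    q_start: int  # 1-based start of the match on the clone
    q_end: int    # 1-based end of the match on the clone


def parse_outfmt6(text: str) -> List[Hit]:
    """Parse BLAST tabular lines with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore."""
    hits = []
    for line in text.strip().splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        hits.append(Hit(fields[0], fields[1], int(fields[6]), int(fields[7])))
    return hits


def find_junction(hits: List[Hit],
                  viral_subject: str = "HPV16") -> Optional[Tuple[int, int, str]]:
    """Return (last viral base, first cellular base, cellular subject) on the
    clone, or None when the clone lacks a viral or a downstream cellular hit.
    "HPV16" as subject name is an assumption about how the reference is labelled."""
    viral = [h for h in hits if h.subject == viral_subject]
    cellular = [h for h in hits if h.subject != viral_subject]
    if not viral or not cellular:
        return None
    viral_end = max(h.q_end for h in viral)
    downstream = [h for h in cellular if h.q_end > viral_end]
    if not downstream:
        return None
    first = min(downstream, key=lambda h: h.q_start)
    return viral_end, first.q_start, first.subject


# Hypothetical hits for one clone: viral part (nt 729-880) followed by a host part.
EXAMPLE_HITS = (
    "clone1\tHPV16\t100.0\t152\t0\t0\t1\t152\t729\t880\t1e-80\t281\n"
    "clone1\tchr13\t99.0\t200\t2\t0\t153\t352\t1000\t1199\t1e-95\t350\n"
)

if __name__ == "__main__":
    print(find_junction(parse_outfmt6(EXAMPLE_HITS)))
```

A clone with only a viral hit returns `None`, which corresponds to a transcript without a detectable cellular flanking sequence.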

Results 

Specificity of APOT assay for HPV16 oncogene transcripts

The APOT assay is in principle a 3' rapid amplification of
cDNA ends (RACE) PCR assay, which amplifies and clones the
region between a single short known sequence in a cDNA
molecule and its unknown 3' end [34]. In general, integrated
transcripts derived from the E6 and E7 oncogenes encompass viral
sequences at their 5' ends and host genome sequences at their
3' ends [26]. The expected size of the product obtained from an
episome-derived transcript is 1050 bp [26]. Amplimers of a size
different from 1050 bp may therefore be derived from an
integrated HPV genome. To test the specificity of the modified
APOT assay, we used cDNAs from HPV16-positive CaSki cells,
which contain integrated HPV16 genomes, and from HPV-negative
normal cervical tissues, as well as "minus-RT" controls in which
reverse transcriptase was omitted from the reactions. The
products amplified from the cDNAs of HPV16-positive CaSki cells
were similar to those in the previous report [26], whereas no
product was obtained from the normal cervical tissues without
HPV DNA or from the "minus-RT" controls (Figure 1). These data
indicate that the modified APOT assay can specifically detect
transcripts derived from the integrated HPV genome.
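
The size criterion described above can be written down as a small rule. The following is a minimal sketch, assuming that band sizes have been estimated from the gel in base pairs; the tolerance of 25 bp is an arbitrary illustrative value and is not taken from this study, and a size-based call is only a first screen that sequencing has to confirm.

```python
EPISOMAL_PRODUCT_BP = 1050  # expected size of the episome-derived product [26]


def classify_amplimer(size_bp: int, tolerance_bp: int = 25) -> str:
    """Label an APOT band by its size alone.

    tolerance_bp is a hypothetical allowance for the limited accuracy of
    sizing a band on an agarose gel; it is an assumption of this sketch.
    """
    if abs(size_bp - EPISOMAL_PRODUCT_BP) <= tolerance_bp:
        return "episomal-like"
    return "candidate integrated"


if __name__ == "__main__":
    for size in (1050, 1070, 600, 1400):
        print(size, classify_amplimer(size))
```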



Characteristics of HPV16 oncogene transcripts in the 
tissues of cervical intraepithelial neoplasia and cervix 
carcinoma 

To analyze the HPV16 oncogene transcripts, 40 HPV16-positive
cervical specimens (LSIL, n = 8; HSIL, n = 24; CxCa, n = 8) with
good-quality RNA were selected from the 63 samples collected in
this study. In total, 133 transcripts containing viral fragments
were found. Among these transcripts, 64 fragments had HPV16
E7-E1* sequences at their 5' ends and were directly connected to
poly A at their 3' ends (Figure S1). Furthermore, there were
four different disruptions of the E1 region, at nt 880, 949, 1054
and 1234 (Figure S2). The transcripts containing the E1 splice
donor signal at nt 880 [36] might belong to a potential episomal
pattern, whereas the transcript truncated at nt 949 might be a
result of internal priming by oligo(dT) [37]. The transcripts
truncated at nt 1054 and 1234 contained neither internal poly A
sequences nor any viral or host polyadenylation site, so they
were viewed as potential integrated patterns. In addition, we
found another transcript in which the E7 ORF is spliced from
nt 880 to the E4 splice acceptor site at nt 3358, then spliced
from the E4 splice donor signal at nt 3632 to the L1 splice
acceptor site at nt 5639, and terminated with poly A (Figure S1).
In this transcript, the E4 ORF is not disrupted. The lack of a
splice donor signal at nt 5815 in this transcript indicates that
disruption of the HPV16 genome at nt 5815 might also take part
in viral genome integration.
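
Whether a truncation site is likely to result from internal priming by oligo(dT) is often judged by looking for an A-rich stretch in the template immediately downstream of the site. The sketch below shows one such heuristic; the window length and both thresholds are assumptions chosen for illustration, not criteria stated in this study, and the example sequences are invented.

```python
import re


def longest_a_run(seq: str) -> int:
    """Length of the longest uninterrupted run of A in seq."""
    runs = re.findall(r"A+", seq.upper())
    return max((len(run) for run in runs), default=0)


def looks_like_internal_priming(downstream: str, window: int = 20,
                                min_a: int = 12, min_run: int = 6) -> bool:
    """Flag a truncation site when the template just downstream is A-rich.

    downstream: template sequence (5'->3') directly after the truncation site.
    window, min_a and min_run are hypothetical thresholds for this sketch.
    """
    head = downstream.upper()[:window]
    return longest_a_run(head) >= min_run or head.count("A") >= min_a


if __name__ == "__main__":
    # Invented sequences, used only to exercise the two criteria.
    print(looks_like_internal_priming("AAAAAAAGTACAAAAGCATT"))  # long A run
    print(looks_like_internal_priming("ACAGAAATAAAGAAAAGAAC"))  # A-rich window
    print(looks_like_internal_priming("GCTAGCTTGCATCGGATCCA"))  # neither
```

A site that is not flagged and lacks a polyadenylation signal would then remain a candidate integration breakpoint, in line with the reasoning applied to nt 1054 and nt 1234.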

In addition, 64 viral transcripts were directly connected to
host genome sequences, and they all began with the 5' end of the
forward primer (p1) at nt 729. These integrated HPV16 oncogene
transcripts could be divided into three different types
(Figure 2). Type A transcripts have HPV16 E7-E1 sequences at
their 5' ends directly connected to host genome sequences. There
were, however, two different integration sites in the E1 region
(at nt 880 and nt 1107) in this type (Figure S3). The site at
nt 880 contains an E1 splice donor signal, whereas truncation at
nt 1107 more likely reflects linearization of the circular viral
genome for integration into the host genome. Type B transcripts
have an E2 ORF disrupted at nt 2870 and are composed of HPV16
E7-E1^E2 at their 5' ends and host genome sequence at their
3' ends. In Type C transcripts, the E1^E4 stop codon is disrupted
by viral integration and an entire E1^E4 ORF without a stop codon
is fused in frame to host sequence. Among these three patterns,
Type A and Type C transcripts have been reported by Wentzensen N,
et al. [31]. However, the Type B transcript had not previously
been reported in precancerous lesions or cervical cancer.

Figure 4. The proportion of patient samples that contain the
different types of transcripts.
doi:10.1371/journal.pone.0097588.g004

Table 1. The Transcript Number of Three Transcription Patterns
in the Groups of LSIL, HSIL and CxCa.

                     Different pathological types
Pattern    Total     LSIL     HSIL     CxCa
Type A     48        3        35       10
Type B     10        1        1        8
Type C     6         -        -        6

64 transcripts in total, directly connected to host genome sequences at their 3'
ends, were detected from 40 HPV16-positive cervical specimens, including 8
LSIL, 24 HSIL and 8 CxCa. Type A and B were detected in all HPV16-positive
samples while type C was only found in CxCa. -, no transcripts.
doi:10.1371/journal.pone.0097588.t001
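
The counts in Table 1 can be cross-checked with a few lines of code. The sketch below only re-adds the published numbers (a dash in the table is entered as 0) and derives the share of each type within a lesion group; note that these are transcript counts, not numbers of patients, so the shares are not the per-sample detection frequencies shown in Figure 4. The function names are illustrative.

```python
# Transcript counts copied from Table 1 ("-" entered as 0).
TABLE1 = {
    "Type A": {"LSIL": 3, "HSIL": 35, "CxCa": 10},
    "Type B": {"LSIL": 1, "HSIL": 1, "CxCa": 8},
    "Type C": {"LSIL": 0, "HSIL": 0, "CxCa": 6},
}


def row_totals(table):
    """Total number of transcripts of each type across all groups."""
    return {pattern: sum(groups.values()) for pattern, groups in table.items()}


def column_totals(table):
    """Total number of fusion transcripts found in each lesion group."""
    totals = {}
    for groups in table.values():
        for group, count in groups.items():
            totals[group] = totals.get(group, 0) + count
    return totals


def share_within_group(table, group):
    """Fraction of a group's fusion transcripts that belong to each type."""
    total = column_totals(table)[group]
    return {pattern: groups[group] / total for pattern, groups in table.items()}


if __name__ == "__main__":
    print(row_totals(TABLE1))
    print(column_totals(TABLE1))
    print(share_within_group(TABLE1, "CxCa"))
```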

Moreover, HPV16 oncogenes showed significantly different
transcription patterns in the tissues of LSIL, HSIL and CxCa
(Figures 3 and 4, Table 1). Among the three transcription
patterns detected in our patients, Type A and Type B were more
prevalent than Type C and were observed in almost all
pathological types, whereas Type C was detected only in CxCa
samples, with a detection frequency of 75% (Table 1 and
Figure 4). All patient samples displayed Type A, but all CxCa
samples had Type B and Type C (Figure 4). Consistent with the
presumption that integration of the viral genome occurs in the
later stages of cancer development [38,39], the prevalence of
fusion transcripts was higher in HSIL and CxCa than in LSIL.

Integration sites and characterization of the cellular 
flanking sequence 

To identify the individual chromosomal locations, all 64 fusion
transcripts containing viral and cellular sequences were further
analyzed by BLASTn comparison against the whole genome database.
Our data show that HPV16 was integrated into all chromosomes
except Chr21 and X, confirming previous reports that HPV
integration shows no preference for particular human
chromosomes [40]. Some loci, such as 1p36.22, 1p36, 2p24, 2q33,
5q31.1, 5q31, 6p24, 8p23, 10q22.1, 13q22.1, 19q13 and 19p13.3,
have been reported previously [31,41-45] (Table 2). Fourteen of
the 40 samples exhibited multiple integration sites (Table 2).
Although local DNA rearrangements can happen frequently and
rapidly after integration [43], we found that the cellular
flanking sequences in 11 tissues mapped to different
chromosomes, indicating the presence of multiple independent
integrations in these samples. Moreover, multiple-integration
events were significantly more frequent in CxCa tissues (75%)
than in the cervical tissues of LSIL (50%) and HSIL (53.8%).
Screening of all integration loci indicated that 35 of 63 mapped
integration sites were located in or close to a fragile site, at
a distance of 26 bp to 5 Mbp (Table 2). Among the 22 mapped
fragile sites, FRA13A was found in 4 independent samples.
Twenty-two transcripts were not associated with any fragile site.
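
The statement that an integration site lies "in or close to" a fragile site amounts to a distance test between a genomic position and an interval. A minimal sketch follows; it treats the 5 Mbp upper bound of the reported distances as a cutoff, which is an assumption of this example, and the fragile-site name and coordinates are invented placeholders rather than real annotations.

```python
MAX_DISTANCE_BP = 5_000_000  # assumed cutoff, taken from the 5 Mbp upper bound


def distance_to_interval(pos: int, start: int, end: int) -> int:
    """Distance in bp from pos to the interval [start, end]; 0 if inside."""
    if start <= pos <= end:
        return 0
    return start - pos if pos < start else pos - end


def nearest_fragile_site(chrom, pos, fragile_sites, max_distance=MAX_DISTANCE_BP):
    """Return (site name, distance) for the closest fragile site on the same
    chromosome within max_distance, or None if there is no such site.

    fragile_sites maps a site name to (chromosome, start, end).
    """
    best = None
    for name, (site_chrom, start, end) in fragile_sites.items():
        if site_chrom != chrom:
            continue
        dist = distance_to_interval(pos, start, end)
        if dist <= max_distance and (best is None or dist < best[1]):
            best = (name, dist)
    return best


# Placeholder annotation, not a real fragile site.
DEMO_SITES = {"FRA_demo": ("chr13", 10_000_000, 12_000_000)}

if __name__ == "__main__":
    print(nearest_fragile_site("chr13", 12_000_026, DEMO_SITES))
    print(nearest_fragile_site("chr13", 20_000_000, DEMO_SITES))
    # Share of mapped integration sites in or near a fragile site (35 of 63).
    print(round(100 * 35 / 63, 1))
```

The last line reproduces the proportion behind the "approximately 55%" figure quoted for integrations in or close to a fragile site.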

The cellular flanking sequences of the viral-cellular fusion
transcripts were further examined for known genes. Most of these
fused transcripts had a cellular sequence in the coding
orientation of known genes; thirty transcripts had cellular
sequence from an intron region, and 8 transcripts were fused to
a sense exon sequence of the predicted genes (Table 2). Among
these predicted genes, AMICA1, DAPK1, EBAG9 and PIBF1 were
affected twice, MRPS31 four times and PRDX5 six times by viral
integration. The nearest host genes to each integration site in
the direction of transcription were also analyzed (Table 2).
Among the predicted genes at or close to the integration sites,
we identified several tumor-associated genes, including PRDX5,
CD28, ROCK2, RHOH, TIMP3 and DAPK1. As shown in Table 1, the
transcripts of type D and E were only detected in CxCa, and most
of their integration loci were located in or close to the
fragile sites FRA13G, FRA22B, FRA2I and FRA13A. The genes
associated with the transcripts of type D and E were oncogenes
(CD28 and EBAG9), tumor suppressor genes (TIMP3), or
tumor-related genes (PIBF1 and MRPS31).

Discussion 

Integration of the HPV genome into host chromosomes represents
an early clonal event that provides an additional selective
advantage for the expansion of the neoplasm. Viral transcripts
have been detected by the APOT assay [26,31,42-45]. Although the
APOT assay has some advantages in detecting transcripts from
each chromosomal integration site, it has several limitations.
First, it is difficult to amplify very long integration-derived
transcripts, which leads to underestimation of the number of
tumors with integrated HPV DNA [45]. Second, APOT is a type of
nested PCR, which tends to amplify transcripts present at higher
levels and miss those at lower levels. Third, it has been
reported that oligo(dT) primers can prime at internal poly A
stretches, and that replacing the oligo(dT) primer with a set of
anchored oligo(dT) primers for cDNA synthesis reduces this
within certain limits. Sequences arising from internal priming
interrupt the generation of full-length cDNA and confound the
analysis of alternative splicing [37]. With our modified APOT
assay, we found many viral transcripts connected to poly A or to
host genome sequences in HPV16-infected cervical squamous
epithelial tissues. We noticed that many viral transcripts ended
directly with poly A at their 3' ends. Apart from the reported
E1 splice donor site (nt 880), the truncation sites at nt 1054,
1234 and 5815, which contained neither internal poly A sequences
nor any polyadenylation signals, may be novel integration sites
and need further analysis. The viral-cellular fusion transcripts
of Type A and Type C have been reported previously [26,31,41].
In the Type C transcript, disruption of the E4 termination codon
by integration would cause E4 to use a host termination codon.
In this study, we also noticed that some cervical cancer samples
contained all three types of viral-cellular fusion transcripts.

HPV16 transcription patterns in LSIL, HSIL, and CxCa were
significantly different. We found that the Type C transcript was
detected only in CxCa samples and that more random integration
sites existed in our tissue samples. Similar to previous
reports [31,38-42,46,47], our study indicates that HPV
integration has no preferential site in the human genome. With
the exception of chromosomes 21 and X, all chromosomes are
susceptible to HPV16 integration. Approximately 55% of
integrations are located in or close to a fragile site. In
contrast to previous reports [42,45], we noticed that multiple
integration events occur significantly more often in cervical
cancer than in LSIL and HSIL. These data not only provide
biological support for the epidemiologic observation that
persistent infection by specific types of HR-HPV is the
important cause of cervical carcinoma [1], but also indicate
that subsequent selection for and accumulation of mutations in
yet-to-be-identified key cellular regulatory genes promote
further progression to cervical cancer.

Integration not only changes the transcription pattern
relevant for the dysregulated expression of the viral oncogenes,
but also affects the expression of the host genes at the
integration site. Integration alters the expression of host
genes at integration sites even if it occurs within intron
sequences [43,45]. In our study, we identified a broad spectrum
of cancer-associated genes at the integration sites and in the
flanking sequence regions. Most of the genes at the integration
sites were associated with tumor development, and nineteen genes
were strongly related to cervical cancer. Some of them act as
tumor suppressors (such as miR-34a, MSH2, WWOX and TIMP3) or
oncogenes (such as ROCK2, CD28, EBAG9 and ANGPT1). Interestingly,
most of them were not reported in previous studies [31,45].
MiR-34a, an important tumor suppressor, is down-regulated in
cervical cancer [48,49]. It has been reported that the E6
oncoprotein of HPV16 and HPV18 can inhibit the expression of
tumor-suppressive miR-34a by destabilizing p53, resulting in
cell proliferation [50]. Disruption of the miR-34a gene might
further explain the reduced expression of miR-34a in cervical
cancer. MSH2 is a DNA mismatch repair protein associated with
the DNA repair pathway [51,52]. Decreased expression of MSH2
might be a risk factor in early-stage cervical cancer [53].
ROCK2, an important signaling molecule, can promote cervical
cancer metastasis by upregulating the expression and activating
the function of moesin protein through the RhoA/ROCK2
pathway [54]. Besides the cancer-associated genes, the genes at
integration sites and in flanking sequence regions might also be
beneficial for viral genome integration. FANCM, a DNA
translocase closely related to DNA replication, regulates
checkpoint signaling and replication fork progression [55,56].
Among other genes, COX6B1 is related to cell apoptosis [57] and
ESRRA has been reported to be associated with cervical
cancer [58]. In addition, among 45 integration events, 13 events
led to antisense transcription of the coding sequences of genes
such as PRDX5, EBAG9 and CD28. Such integrations are generally
deemed of no interest. However, the sense sequences of these
genes are associated with DNA repair or tumor development and
might affect both host and viral gene expression during the
development of cervical cancer. The gene most frequently
integrated in the antisense orientation was that encoding
peroxiredoxin 5 (PRDX5), a protective enzyme against oxidative
stress [59,60]. Its altered expression due to HPV16 integration
could have significant virological consequence,